Dremio #367

Open
wants to merge 18 commits into
base: main
maxfirman

Overview

Adds Dremio support.

Update type - breaking / non-breaking

  • Minor bug fix
  • Documentation improvements
  • Quality of Life improvements
  • New features (non-breaking change)
  • New features (breaking change)
  • Other (non-breaking change)
  • Other (breaking change)

What does this solve?

Closes #366.

Outstanding questions

The main outstanding issue is setting up a Dremio cluster for the CI pipeline integration tests. So far I've been testing this locally against our on-prem cluster. I've managed to get it building successfully against the integration test project and our in-house project.

There is potential for refactoring the upload macros as there is quite a lot of code duplication. I refrained from implementing this refactoring to keep this PR focused on just adding Dremio support, but it would make sense to handle that in a separate PR.

What databases have you tested with?

  • Snowflake
  • Google BigQuery
  • Databricks
  • Spark
  • Dremio
  • N/A

@maxfirman maxfirman had a problem deploying to Approve Integration Tests July 1, 2023 16:12 — with GitHub Actions Failure
@maxfirman maxfirman had a problem deploying to Approve Integration Tests July 3, 2023 12:34 — with GitHub Actions Failure
@glsdown
Contributor

glsdown commented Jul 24, 2023

Hi @maxfirman. Thanks for taking the time to add this functionality.

One of the team will spend some time reviewing it and get back to you.

@maxfirman maxfirman had a problem deploying to Approve Integration Tests July 25, 2023 09:21 — with GitHub Actions Failure
@maxfirman
Author

Hi @glsdown, thanks for taking the time to review my PR.

I can see that my changes caused a regression for upload_sources and dim_dbt__snapshots. I've just pushed a commit that will hopefully resolve those failures.

@maxfirman maxfirman had a problem deploying to Approve Integration Tests September 29, 2023 15:43 — with GitHub Actions Failure
@glsdown glsdown had a problem deploying to Approve Integration Tests September 29, 2023 16:05 — with GitHub Actions Failure
@glsdown glsdown had a problem deploying to Approve Integration Tests September 29, 2023 16:23 — with GitHub Actions Failure
@maxfirman maxfirman had a problem deploying to Approve Integration Tests September 29, 2023 20:32 — with GitHub Actions Failure
@maxfirman
Author

@glsdown thanks very much for taking a look at this. I appreciate it's a reasonable amount of work to set up integration testing, and I'm happy to help as much as I can.

I've tested your latest changes. I had to make a small patch, but I can now build the test project locally without errors.

There are a couple of ways to go in terms of setting up a Dremio instance for testing. The most straightforward approach, assuming you have access to an AWS account, would be to spin up a Dremio Cloud account and link it to your AWS account. You would then need to connect to a metastore in order to be able to create Apache Iceberg tables. Your two options would be either AWS Glue Catalog or Dremio's Arctic metastore. I don't have much experience with the latter, so I would probably suggest going with AWS Glue Catalog. In either case you will also need to spin up an S3 bucket to store the actual data, and configure it as the hive.metastore.warehouse.dir.
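
As a rough illustration of what the CI side of the Dremio Cloud option might look like, here is a sketch of a dbt profile target. The field names follow the dbt-dremio adapter's Dremio Cloud profile and should be checked against the adapter version in use. The profile name, the environment variable names, and the `glue_catalog` / `dbt_artifacts_ci` values are placeholders, not part of this PR.

```yaml
# profiles.yml (sketch only): hypothetical CI target for Dremio Cloud.
dbt_artifacts_integration_tests:        # placeholder profile name
  target: dremio
  outputs:
    dremio:
      type: dremio
      cloud_host: api.dremio.cloud
      cloud_project_id: "{{ env_var('DREMIO_PROJECT_ID') }}"  # assumed env var name
      user: "{{ env_var('DREMIO_USER') }}"                    # assumed env var name
      pat: "{{ env_var('DREMIO_PAT') }}"                      # personal access token, assumed env var name
      use_ssl: true
      object_storage_source: glue_catalog   # placeholder: name of the Glue source in Dremio
      object_storage_path: dbt_artifacts_ci # placeholder: folder for the Iceberg tables
      threads: 4
```

Keeping the credentials in environment variables would let them live in GitHub Actions secrets rather than in the repository.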

The alternative approach would be to go totally DIY and use the standalone dremio-oss Docker image. The problem with this is that you would also need to spin up a MinIO object store and a Hive Metastore in order to be able to create Iceberg tables. Getting everything configured and talking to each other would probably require more effort than the Dremio Cloud / AWS approach.
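
To give a sense of the moving parts in the DIY option, here is a minimal Docker Compose sketch of the three services. It is untested against this PR: the image tags, credentials, and port mappings are assumptions. It also leaves out the harder part, namely pointing the metastore at MinIO (S3A settings and the matching Hadoop AWS jars) and registering the sources in Dremio.

```yaml
# docker-compose.yml (sketch only): service layout, not a working Iceberg setup.
services:
  minio:
    image: minio/minio
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: minio-user           # placeholder credentials
      MINIO_ROOT_PASSWORD: minio-password   # placeholder credentials
    ports:
      - "9000:9000"   # S3-compatible API
      - "9001:9001"   # web console

  hive-metastore:
    image: apache/hive:4.0.0                # assumed tag
    environment:
      SERVICE_NAME: metastore               # run only the metastore service
    ports:
      - "9083:9083"   # Thrift metastore endpoint
    depends_on:
      - minio

  dremio:
    image: dremio/dremio-oss
    ports:
      - "9047:9047"     # web UI and REST API
      - "31010:31010"   # ODBC/JDBC clients
      - "32010:32010"   # Arrow Flight
    depends_on:
      - minio
      - hive-metastore
```

Even with the containers running, the bucket, the warehouse directory, and the Dremio sources still have to be created before the integration tests can build Iceberg tables, which is where most of the extra effort would go.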

@maxfirman maxfirman had a problem deploying to Approve Integration Tests May 7, 2024 09:53 — with GitHub Actions Failure
@maxfirman maxfirman temporarily deployed to Approve Integration Tests May 7, 2024 09:53 — with GitHub Actions Inactive
@maxfirman maxfirman requested a deployment to Approve Integration Tests May 16, 2024 13:53 — with GitHub Actions Waiting
Development

Successfully merging this pull request may close these issues.

[Feature]: Support Dremio
2 participants